


An Absolute Point Scale for the 

Group Measurement 

of Intelligence 











BY 








ARTHUR S. OTIS 

Surgeon General's Office, Washington, D. C. 



(Reprint from The Journal of Educational Psychology, Vol. IX, Nos. 5, 6, May-June 1918)






AN ABSOLUTE POINT SCALE FOR THE GROUP 
MEASUREMENT OF INTELLIGENCE. 

ARTHUR S. OTIS 
Surgeon General's Office, Washington, D. C. 

Contents 
Part I. 
I Introduction: Purpose. 
II The Tests. 

Requirements of a scale for mass testing. 
Description of the tests. 

III The Preliminary Investigation. 

IV Acquisition of the Data. 

Administration of the tests. 
Scoring. 

V The Reliability of the Scale. 

Probable errors of the test scores. 
Reliability coefficients. 

VI Graduation of the Scale. 

Theoretical considerations. 

Equating the scores. 

Weighting and combining the scores. 

Age norms. 

Completing the Absolute Point Scale. 

Coefficients of Brightness. 
Part II. (See June number) 
VII Overlapping of Ability between Grades. 
VIII Refinement of the Scale. 

The order of difficulty of the test elements. 

The diagnostic value of the single test elements. 
IX Inter-test Correlations. 
X Further Considerations regarding Reliability. 

The Reliability Coefficient of the Point Scale. 

The Probable Error of the Scale. 
XI Comparisons with School Mark and Amount of Schooling. 

Appendix I. Sample Extracts of Tests. 



*The writer is indebted to Dr. Lewis M. Terman of Stanford University for many 
helpful suggestions during the making of this study.




6 THE JOURNAL OF EDUCATIONAL PSYCHOLOGY 

Appendix II. Showing the Point Scores of Each Pupil in Each 
Test. 

Appendix III. Some Mathematical Reasoning with Regard to 
Criteria of Tests of Intelligence. 

Appendix IV. Inter-Test Correlations (Raw and Corrected). 

References. 

I. Introduction 

Purpose — The purposes of this study are: 

(1) To construct a scale for the measurement of general mental 
ability, such scale being: 

(a) suitable primarily for administration to the pupils in 
grades 4, 5, 6, 7, and 8 of the elementary school, 

(b) capable of being administered to groups of at least 50, 

(c) so constructed that the scoring is both rapid and, as far 
as possible, free from the error of the personal factor, 

(d) built upon the general plan for an Absolute Point Scale 
outlined by the writer (see Ref. 10). Such construction would in- 
volve the "validation" and "graduation" of the tests, the determin- 
ation of the probable error of a determined measure of general 
mental ability, etc. 

(2) To investigate the correlation of the mental abilities tested 
by the scale. 

It is not deemed feasible, in the space to which this article is
limited, to attempt to discuss the nature of intelligence, such as is
presumed to be measured by the present scale, nor the various
definitions of intelligence which have been given or may be implied
by the various 'intelligence scales' in present use. These subjects
will be touched upon in various connections in the discussion. The 
writer will therefore proceed immediately to describe the manner 
in which the present scale was constructed. 

II. The Tests 

Requirements of a Scale for Mass Testing. — The chief object of 
testing in groups, of course, is economy of time. One of the most 
essential means of accomplishing this purpose is that the responses 
required be very simple. This makes for speed both in the adminis- 
tration and in the scoring of the tests. It has been the aim, therefore, 
so to arrange the tests in the present scale that there would be only 
one correct answer to each item and that this might be indicated 
merely by making a letter or figure or drawing a line. Where con- 
venient, provision was made for the responses to be placed in a 
single column in which case the papers may be scored with dispatch 
by the use of scoring forms. When every answer is either right or
wrong, a large amount of time is saved that might otherwise be
needed to determine the value of partially correct answers.
Moreover, under these conditions the tests must be scored in the
same way by all investigators, which assures comparability.

The ideas for the tests have been derived from various sources, 
chiefly, perhaps, from the Stanford Revision of the Binet Scale, 
(see Ref. 14). The general character of most of them is no doubt 
familiar.* In substance the remaining tests were designed especially 
for this study. 

Description of the Tests. — The scale was compiled in duplicate. 
There were, in other words, two complete tests of each kind. The 
two tests of each kind were made as nearly alike as possible without 
using the same material. In each scale the tests were constructed 
as follows: 

The Spelling Test† consisted of fifty pairs of words in two columns.
The words of each pair consisted of the correct and incorrect spelling
of a single word. In some cases the first spelling was the correct
one and in some cases the second was the correct one. The
pupil was required to indicate by the letters F, S, or N, placed in a
parenthesis opposite the words, as shown in the sample in Appendix
I, whether the first or the second was the correct spelling, or neither
spelling was correct.

The Arithmetic Test consisted of 16 problems in which the compu- 
tation was made as easy as possible and the emphasis thus placed 
upon reasoning. 

The Synonym and Antonym Test consisted of 50 pairs of words as
shown in the sample. The pupils were required to indicate by the 
letters, S and O, whether the words of a pair meant the same or 
the opposite. 

The Proverb Test consisted of 20 proverbs in two sets of ten prov- 
erbs, each set followed by twelve statements, one of which "ex- 
plained" each of the ten proverbs, there being two extra statements 
in each set not explaining the proverbs. The pupils were required 
to place in the parenthesis before each proverb the number of the 
statement which explained it. 

*Thanks are due to Mrs. Mary D. Chamberlain for the Proverb Test used in this 
study. The words of the Spelling Test, as explained later, were taken from Ayres' 
list. (See Ref. 2.) 

†Although the Spelling Test was found, in both the preliminary and in the present
investigations, to afford quite as good a measure of intelligence, from one point of 
view, as the other tests (see intercorrelations), it has since been considered best to 
drop this test from the scale. 




The Disarranged Sentence Test consisted of 26 sentences with the 
words disarranged, as shown in the sample. The pupils were 
required to rearrange the words mentally to make sense and indicate 
whether the sentences so constructed were true or false by under- 
lining the words true or false at the end of the line. 

The Relation Test consisted of 24 items, each in the form of a
proportion in which one of the four terms was to be supplied, indicated
by number from five alternative answers given on the same line.

The Geometric Test consisted of 22 items, using as a basic principle 
that described by Abelson. (See Ref. 1.) Referring to the figures 
constructed by overlapping one or more circles, triangles, and rec- 
tangles, the pupils were required to place figures 1, 2, etc., in cer- 
tain designated spaces as suggested in the sample. 

The Following Directions Test consisted of 14 problems requiring
the pupils to place certain numbers in certain figures on the Woodworth
and Wells Cancellation Sheet. This test presumably differed
from the preceding one in that its difficulty consisted more in
the comprehension of involved language, while in the Geometric
Test the difficulty lay chiefly in tracing out the space relations.

The Narrative Completion Test was of the type used by Whipple, 
Ebbinghaus, Terman, and others. It consisted of a short story of 
which certain words were omitted leaving blanks in which the pupils 
were required to write the words which in their judgment best fit- 
ted into the story. 

III. The Preliminary Investigation 

To aid in choosing the tests for the Point Scale and in determin- 
ing the most suitable forms in which to give them, a preliminary 
investigation was conducted in which 29 pupils in grades 4 to 8 of 
a small school were tested. Fifteen tests were used. Eight of 
these were in the same or nearly the same form as those shown in 
the Appendix. Two others, the Arithmetic and Spelling Tests, 
have since been entirely made over. The Synonyms Test was given 
orally first, then about three weeks later repeated in the form shown. 
The other five tests were: a test in word meaning recognition, of 
the type suggested by the writer (See Ref. 8) and called the Read- 
ing Test; a test in the reproduction, in writing, of sentences dic- 
tated, called the Memory for Sentences Test; the Trabue Comple- 
tion Test (see Ref. 15); the Kansas Silent Reading Test (see Ref. 
4); and the Starch Grammar Test (see Ref. 12). The list is shown 
in Table I. The tests marked with the asterisk were given in dupli- 
cate, the double scores being used in the correlations. 

The correlations of each test with the composite of all scores
except those of the Oral Synonyms and Grammar Tests are shown in
the first column of the table. The correlations with mental age
as determined by the Stanford Revision of the Binet Scale are shown
in the second column. Considering the small number of individuals
as well as the unreliability of the scores, the coefficients may be
regarded as of suggestive value only.

TABLE I.
Some Results of the Preliminary Investigation

                               Correlation with   Correlation with
Test                           Composite          Mental Age

Relation                       .94                .97
Proverbs*                      .94                .94
Following Directions           .86                .95
Geometric*                     .89                .92
Trabue Completion Test*        .88                .88
Reading*                       .92                .82
Kansas Silent Reading Test     .90                .88
Synonyms (Oral)                .79                .87
Synonyms (Written)             .83                .85
Disarranged Sentences*         .86                .81
Narrative Completion           .86                .80
Arithmetic                     .84                .80
Spelling*                      .79                .84
Memory for Sentences           .77                .82
Memory for Digits*             .72                .42
Starch Grammar Test            .30                .49

Correlation of Mental Age with Composite Score: .94
Same, "corrected for attenuation" (estimate): .99

IV. Acquisition of the Data 

Administration of the Tests. — The tests were given to 121 children 
of a large grammar school — 43 in the fourth grade, 40 in the 
sixth grade, and 38 in the eighth grade. Each test was taken 
by all of the pupils of one grade at a time in their regular 
room, the teacher being present. The writer personally conducted 
all the tests, giving all the directions and explanations. The tests 
were given in approximately the order indicated in the foregoing 
section. In most instances two test series were given to each grade 
each day, one in the morning and one in the early afternoon. The
first and second tests of the same kind were given three or more
days apart in most instances, and particularly in the cases
of the Arithmetic, Geometric, and Following Directions Tests, in

*Double scores used. 
which the second test is in the nature of a recast of the first. Time
was allowed for all to finish in nearly all cases, except, of course, the 
Disarranged Sentence Test, which is a speed test. Occasionally 
when one or two pupils lagged far behind the others, their papers 
were taken up before they finished. In such cases it was usually 
noted that the pupils had permitted themselves to be distracted 
from their work. In the Disarranged Sentence Test sufficient time 
was allowed for only one pupil to finish. The order, "Stop," was 
then given and the time noted. For purposes of comparison, all 
scores were afterward increased to a five minute basis. 

The pupils of each grade were adjured at the beginning of the 
testing not to give or receive aid during the taking of any tests. A 
wholesome attitude appeared to be taken by all during the testing. 
In such instances of apparent collusion as were noted, the pupils 
were quietly cautioned. These instances were few. On the whole 
the pupils were orderly and attentive and signified their interest 
in the testing. 

Scoring. — In the case of each test except the Synonyms, Spelling, 
and Disarranged Sentences, one count was given for each correct 
answer and no count for incorrect or omitted answers. In the case 
of Synonyms, however, since there are but two alternative answers, 
S or O, theoretically, of the answers given concerning the pairs of 
words not known by any pupil, but guessed at, one half will be right 
by chance. Therefore, if say 35 of the 50 were known and correctly 
marked, and 10 of the remaining 15 guessed at, leaving 5 blank; of 
the 10 guessed at, 5 might be marked rightly by chance. This 
would make 40 correct, 5 incorrect, and 5 blank. It seemed, therefore,
that as many counts should be deducted from the total correctly
marked (40) as were incorrect (5), thus giving a score of
40 - 5 = 35, the number assumed to be known. A person guessing at
all of them and getting half right by chance would then attain a
score of 25 - 25 = 0. This method was adopted in scoring the Synonyms
and Disarranged Sentence Tests. The case of the Disarranged
Sentences is complicated by the fact that sentences wrongly
marked on account of haste are penalized additionally by the loss 
of time in scanning them. The suitability of the method, therefore, 
should perhaps be investigated. 
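
The rights-minus-wrongs rule described above can be illustrated with a minimal Python sketch. The function name and the variables are hypothetical, chosen only for this illustration; the figures are those of the worked example in the text.

```python
def corrected_score(right: int, wrong: int) -> int:
    """Rights minus wrongs for a two-alternative test; blanks are ignored."""
    return right - wrong


# Worked example from the text: 35 pairs known, 10 guessed at
# (5 right and 5 wrong by chance), 5 left blank.
known, guessed_right, guessed_wrong = 35, 5, 5
score = corrected_score(known + guessed_right, guessed_wrong)

# A pupil guessing at all 50 pairs and getting half right by chance.
pure_guess = corrected_score(25, 25)

print(score, pure_guess)
```

On the assumption that wrong answers arise only from guessing, the deduction removes, on the average, as many counts as guessing added.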

Since there are three possible answers, F, S, or N, in the Spelling
Test, theoretically 1/3 of the number of those guessed at would be
marked rightly by chance. This would mean that, to follow the
above method, the score should be obtained by deducting from the
number rightly marked 1/2 the number of those wrongly marked.
However, inasmuch as there were only fourteen individuals who did
not attempt all the words, and to avoid possible negative scores,
the scores were obtained by giving one count for each right answer,
no count for each wrong answer, and 1/3 count for each blank, on
the assumption that if guessed at, 1/3 of these words would have
been rightly marked. This brought all the scores to the same
basis and necessitated counting only right answers in all but 14
cases. Identical rank orders of the individuals are obtained from
the scores by the two methods. The scores that would have been
obtained by using the first method can easily be derived from those
used merely by multiplying by 1 1/2 and subtracting 25.
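
The relation between the two Spelling scores can be checked by a short computation. The sketch below is illustrative only: it assumes that the first method deducts one half of the wrong answers (three alternatives), that the adopted method credits one third of a count for each blank, and that scores are not rounded. The function names are hypothetical.

```python
from fractions import Fraction

ITEMS = 50  # word pairs in the Spelling Test


def first_method(right: int, wrong: int) -> Fraction:
    """Rights minus one half of the wrongs; blanks earn nothing."""
    return Fraction(right) - Fraction(wrong, 2)


def adopted_method(right: int, blank: int) -> Fraction:
    """One count per right answer plus one third of a count per blank."""
    return Fraction(right) + Fraction(blank, 3)


def relation_holds_everywhere() -> bool:
    """Check first = 1 1/2 x adopted - 25 for every possible paper."""
    for right in range(ITEMS + 1):
        for blank in range(ITEMS - right + 1):
            wrong = ITEMS - right - blank
            expected = Fraction(3, 2) * adopted_method(right, blank) - 25
            if first_method(right, wrong) != expected:
                return False
    return True


print(relation_holds_everywhere())
```

Because the one score is a linear function of the other with a positive multiplier, the rank orders given by the two methods must agree, as the text states.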

In order to obtain a suggestion as to the value of the above meth- 
ods of scoring, the sum of the differences between the first and second 
scores of the 14 pupils above mentioned was found first when the 
scores were obtained as above and second when obtained by merely 
counting the number of correct answers, taking no account, there- 
fore, of the element of chance. The sum of the differences in the 
first case was 34 and in the second 45, although the scores were less 
in the second case. This suggests that the method employed was 
the more reliable. 

While there is, of course, a 'one-in-five chance' of an element of 
the Relation Test being marked rightly by guess, it was not deemed 
necessary to take account of it. In an auxiliary investigation
regarding the scoring of the Digit Test, the papers were scored (1)
according to the number of digits in the last number correctly re- 
produced, (2) according to the number of digits in the next to the 
last number correctly reproduced, and (3) according to the last 
group of numbers of the same size of which two or more were cor- 
rectly reproduced. No one of these three methods appeared to be 
appreciably superior to the others. The reliability coefficient of 
scores by method (1) was .53, by the method employed in this study, 
.74. It was discovered after giving the test that some of the pupils 
were able to reproduce numbers of nine or more digits within three 
trials. If such had been included in the test, the reliability coeffic- 
ient by method (1) would no doubt be higher. It is believed, 
therefore, that with a sufficiently exhaustive test, the loss in reli- 
ability of method (1) would be more than made up by the great 
saving in time of scoring. 

Briefly, the plans of scoring were as shown in Table II.

TABLE II.
Summary of Plans of Scoring

Test                     Score

Spelling                 1 count for each correct answer and
                         1/3 count for each blank (nearest whole number).
Arithmetic               1 count for each correct answer.
Synonyms                 1 count for each correct answer and
                         1 count deducted for each incorrect answer
                         (blanks not counted).
Memory for Digits        1 count for each number entirely correct.
Proverbs                 1 count for each correct answer.
Disarranged Sentences    1 count for each correct underlining with
                         1 count deducted for each incorrect underlining.
Relation                 1 count for each correct answer.
Geometric                1 count for each figure 1 correctly placed, provided no
                         other figure 1 appeared in the same design, and
                         similarly,
                         1 count for each figure 2 correctly placed.
Following Directions     1 count for each direction correctly followed.
Narrative Completion     1 count for each blank satisfactorily filled.

The scores for each individual in each test will not be given as obtained by the
above plan but instead they will be given in an altered form explained below. The
scores are given in Appendix II.

V. The Reliability of the Scale 
The reliability of a test may be expressed in two ways, either (1)
by giving the probable error of a score in the units of the score, the
probable error being the value of that error which is exceeded in
amount by half the errors, or (2) in terms of the coefficient of
correlation between two tests of the same kind. The probable error
of a score as a measure of the reliability of a scale is comparable with
other values of the probable error found in connection with the
testing of other groups of individuals, but it is not comparable with
the probable error of the scores of other tests unless the units in
the two scales measure the same increments of ability. This would
not happen often and only accidentally. The reliability coefficient,
as has been explained more fully elsewhere (see Ref. 11), derived
from measures of one group is not comparable with a reliability
coefficient for the same test derived from measures of another group
unless the heterogeneity of the two groups is the same or nearly
the same. In many instances, of course, this is not the case. The
reliability coefficient of one test, however, is comparable with that
of another test when both are derived from measurements of the
same group. The reliability coefficients are necessary under these
conditions to show the relative reliabilities of two tests. They
compensate the measures of reliability for inequalities of scale units.
We have therefore found both the probable errors and the reliability
coefficients of each of the ten tests. The probable errors of
scores in the several tests were found according to the method
described at length in Ref. 11. This method is expressed by the
formula,

\[ \mathrm{P.E.} = \frac{\text{Med. Dif.}}{\sqrt{2}} \]

in which Med. Dif. is the median difference between scores by the
same individuals in the two tests of the same kind. (A test of the
first scale is called Test I; the corresponding test of the second scale
Test II.) Before making the subtractions, however, it is necessary
to have the scores of both tests in terms of either one or the other
of the two tests, since these are quite often somewhat different, due
to slight differences in difficulty, to practice effect, etc. For the
purpose of evaluating the scores of one test in terms of the other, 
plots were made in which the scores in Test I were represented as 
abscissae (horizontally) and those of Test II as ordinates. The 
manner in which the scores in the two tests corresponded was then 
found by drawing in each plot a line of relation. This is such a 
line that the abscissa and ordinate of any point on it represent cor- 
responding scores in the two tests. By inspection of the plots, it 
was deemed valid to draw a straight line of relation in all cases 
except that of the Narrative Completion Test, in which it was ap- 
parent that the true line of relation was markedly curved. In 
that case the curve of relation was drawn by the method we have 
called the method of correspondence by rank (see Ref. 7). In all 
cases except that of the Narrative Completion Test, the line of relation
was obtained by finding the means, $M_x$ and $M_y$, of the values
of x and y and the average deviations, $\mathrm{A.D.}_x$ and $\mathrm{A.D.}_y$, of the
distributions of values of x and y, and then drawing a line through
the point $(M_x, M_y)$ having a slope such that the tangent of the
angle $\theta$ formed with the X axis is

\[ \tan\theta = \frac{\mathrm{A.D.}_y}{\mathrm{A.D.}_x} \]


To find the score in terms of Test II which corresponds to any given 
score in terms of Test I, it is necessary merely to find the point on 
the relation line corresponding to the score in Test I and to note the 
score in Test II at the left which corresponds to this point. The
differences between the scores, in terms of Test II, are measured by
the distances of the points of the plot above or below the line; in
terms of Test I, by the distances to the right or left. The values of
Med. Dif. were obtained by the method (see Ref. 7):

\[ \text{Med. Dif.} = .8453 \times \text{Avg. Dif.} \]

That is,

\[ \mathrm{P.E.} = \frac{.8453\,(\text{Avg. Dif.})}{1.414} \]

The values of the probable errors of each of the several tests were 
obtained first in terms of Test II and the corresponding values in 
terms of Test I were derived by dividing by the tangent of the angle 
of the line of relation. The values of the probable error in both 
terms are given in Table III. 
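
The computation of the probable error described above may be sketched as follows. This is an illustration under stated assumptions, not the writer's own working: the line of relation is taken as the straight line through the two means with slope equal to the ratio of the average deviations, the differences are measured vertically (that is, in terms of Test II), and all function names and the sample scores are hypothetical.

```python
from statistics import mean


def average_deviation(values):
    """Mean absolute deviation of the values from their mean."""
    m = mean(values)
    return mean(abs(v - m) for v in values)


def line_of_relation(test1, test2):
    """Slope and intercept of the line through (Mx, My) with slope A.D.y / A.D.x."""
    slope = average_deviation(test2) / average_deviation(test1)
    intercept = mean(test2) - slope * mean(test1)
    return slope, intercept


def probable_errors(test1, test2):
    """Return (P.E. in terms of Test I, P.E. in terms of Test II)."""
    slope, intercept = line_of_relation(test1, test2)
    # Express each Test I score in terms of Test II, then take differences.
    difs = [abs(y - (slope * x + intercept)) for x, y in zip(test1, test2)]
    avg_dif = mean(difs)
    pe_test2 = 0.8453 * avg_dif / 1.414
    pe_test1 = pe_test2 / slope  # divide by the tangent of the angle
    return pe_test1, pe_test2


# Hypothetical scores of five pupils, for illustration only.
scores_test1 = [10, 12, 14, 16, 18]
scores_test2 = [21, 26, 29, 32, 37]
print(probable_errors(scores_test1, scores_test2))
```

The constants .8453 and 1.414 are those given in the text; the second is the square root of 2 to three decimals.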

TABLE III.
Reliability of the Tests

                                 Probable Errors         Reliability Coefficients
Test                             Scale I.   Scale II.    Single Tests   Double Tests

 1. Spelling                     1.49       1.45         .942           .970
 2. Arithmetic                    .74        .80         .871           .931
 3. Synonyms and Antonyms        1.96       1.86         .753           .859
 4. Memory for Digits            1.04       1.21         .746           .855
 5. Proverbs                     1.17       1.02         .761           .864
 6. Disarranged Sentences        1.28       1.76         .737           .849
 7. Relation                     1.50       1.86         .729           .843
 8. Geometric                    1.22       1.15         .805           .892
 9. Following Directions          .75        .97         .820           .901
10. Narrative Completion         5.43       3.80         .840           .913

Total                                                                   8.877



The formula used for finding the reliability coefficients was:

\[ r = 1 - \frac{1}{2}\left(\frac{\mathrm{A.D.}_{(\text{difs})}}{\mathrm{A.D.}_{(\text{scores})}}\right)^{2} \]

which is a variation of the difference formula:

\[ r = 1 - \frac{\sigma^{2}_{(y - x_y)}}{2\,\sigma_{y}^{2}} \]

This latter formula is the equivalent of the Pearson product-moment
formula. (See Ref. 11.) In these formulae, $\mathrm{A.D.}_{(\text{difs})}$ and
$\sigma_{(y - x_y)}$ are measures of the variability of the distribution of
differences between the scores of each of the 121 pupils in Test I
and Test II, when the scores in Test I are evaluated in terms of
Test II; and $\mathrm{A.D.}_{(\text{scores})}$ and $\sigma_y$ are respectively
corresponding measures of the variability of the distribution of scores
in Test II.

The reliability coefficients thus found for each test are shown in 
the third column of Table III. 
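
As an illustration of the average-deviation form of the reliability formula, a minimal Python sketch follows. It assumes that the Test I scores have already been evaluated in terms of Test II, and that each A.D. is the mean absolute deviation about the mean of its own distribution; the function names and the sample scores are hypothetical.

```python
from statistics import mean


def average_deviation(values):
    """Mean absolute deviation of the values from their mean."""
    m = mean(values)
    return mean(abs(v - m) for v in values)


def reliability_coefficient(test1_in_terms_of_test2, test2):
    """r = 1 - (1/2) * (A.D. of differences / A.D. of Test II scores) ** 2."""
    difs = [y - x for x, y in zip(test1_in_terms_of_test2, test2)]
    ratio = average_deviation(difs) / average_deviation(test2)
    return 1 - 0.5 * ratio ** 2


# Hypothetical scores of five pupils, for illustration only.
first = [1, 2, 3, 4, 5]
second = [2, 1, 3, 5, 4]
print(round(reliability_coefficient(first, second), 3))
```

When the two sets of scores agree exactly the differences have no variability and the coefficient is 1; the larger the differences relative to the spread of the scores, the lower the coefficient.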

We were quite surprised to find the Spelling Test to be so much 
in the lead in this rating. However, the Spelling, Narrative Com- 
pletion, and Synonym Tests had 50 elements while the other tests 
had only 25 or less. The Arithmetic, Following Directions, and 
Geometric Tests no doubt have an advantage over the others in 
that Test II was only slightly different from Test I. 

The aim in duplicating the tests, as has been stated, was to make
the second test in each case as nearly like the first as possible with- 
out actually copying it. This was done in order that the score in 
the second test would be as near as possible to a second score in 
the same test. It is possible that a second score in the same test 
would have been preferable for finding the reliability if it had been 
convenient to separate the two givings of the tests by a sufficient 
interval. Even this, however, would introduce new sources of 
error. Since the differences in difficulty between the two tests of 
a kind are not the same for all the pupils, the differences between 
the scores in the two tests tend to be greater than would be the case 
if the same test could be given twice, even without memory of the
first testing, in which case the difference in the scores would be due 
merely to differences in disposition at the times of taking the first 
and second tests. For this reason, the values of the probable er- 
rors and reliability coefficients, considering only errors due to vary- 
ing disposition, are really less than those given here. 

Further considerations regarding reliability will be given later.
These depend upon the values of inter-test correlations.

VI. Graduation of the Scale 

Theoretical Considerations. — There are two aspects to the gradu- 
ation of the scale. One deals with the proper combining of the 
scores of the several tests and the other with the finding of age 
norms, percentage norms, etc. The scores of each individual in 
the ten tests must first be combined into a single score, say a "point- 
score," and then those point-scores may be determined which are 
normal for each of the given ages of childhood, or those which given
percentages of adults may be expected to attain, etc. 

In order that the scores of an individual in the several tests may
be properly averaged, it is necessary to take account of the differences
in value of the units of the scales of the several tests. If
an increment of one problem in an Arithmetic score is in reality
equal to an increment of four words in the score of the Synonym
Test, to average the scores in the two tests just as they stand would
be to give the Synonym Test four times as much weight as the
Arithmetic Test. If, therefore, it is desired to give equal weight
to each test, the score of an individual in each test must be transmuted
into other terms, say "points," such that equal increments
of ability in each test receive equal increments of points. It is
convenient, also, while assigning point values to the scores in the
several tests, to arrange that corresponding amounts of ability in the
several tests shall receive corresponding numbers of points.
The first of these conditions is essential and the second convenient 
for the purpose of averaging scores properly; both conditions are 
essential for the purpose of comparing scores in the several tests with 
one another. If it seemed reasonable to assume that for the in- 
dividuals of a given group, the ability possessed by the upper 25% 
in any test was as much above that possessed by the upper 50% as 
that ability was above the ability possessed by the upper 75%, 
and if the ability in any one test was considered equal to that in 
any other test which was possessed by the same percentage of 
individuals, then the first of the above mentioned conditions would 
be complied with by representing the difference between upper 
25% and upper 50% ability in each of the several tests by some 
number of points (say 10), and the difference between upper 50% and 
upper 75% ability in each of the several tests by that same number 
of points (10). And the second condition would be complied with 
by representing 50% ability in all of the ten tests by the same 
number of points (say 50) in which case, of course, 25% ability
would be represented in each case by 60 points and 75% ability by 
40 points. 
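
The two conditions may be made concrete by a small sketch. It assumes, purely for illustration, that point values between and beyond the three reference scores are found by straight-line interpolation; the reference scores for the two tests are invented, chosen so that one Arithmetic problem corresponds to four Synonym words, and the function name is hypothetical.

```python
def points_from_raw(raw, upper75_score, upper50_score, upper25_score):
    """Map a raw score to points: 75% ability -> 40, 50% -> 50, 25% -> 60."""
    if raw <= upper50_score:
        step = upper50_score - upper75_score
    else:
        step = upper25_score - upper50_score
    return 50.0 + 10.0 * (raw - upper50_score) / step


# Invented reference scores (upper 75%, upper 50%, upper 25% ability).
ARITHMETIC = (6, 9, 12)
SYNONYMS = (14, 26, 38)

pupil_arithmetic = points_from_raw(12, *ARITHMETIC)
pupil_synonyms = points_from_raw(26, *SYNONYMS)
average_points = (pupil_arithmetic + pupil_synonyms) / 2

print(pupil_arithmetic, pupil_synonyms, average_points)
```

Averaging the points rather than the raw scores gives the two tests equal weight, since ten points span three problems in the one test and twelve words in the other.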

We have been speaking of the equality of increments of ability, 
but such equality is a very indefinite thing. Equal increments of 
ability must be such as are measured by the same number of units 
of some kind. We have not been willing to grant that the steps 
of any test scale necessarily measured equal increments of ability. 




Nor would we admit that any year's growth in ability is equal to 
every other year's growth. The growth of ability is supposed to 
retard eventually with age. In what units then will we say ability 
may be measured so that equal numbers of units measure equal 
increments of ability? In a previous article (Ref. 10) we have 
suggested that absolute units of ability be so defined that the dis- 
tribution of abilities of all adults will be normal (in the technical 
sense). This would mean that those percentages of adults which 
were considered as possessing abilities which marked successive 
steps on an absolute scale of ability were the same percentages as 
those of the normal probability surface which corresponded to 
successive units of the base. Until such time, however, as a very
large number of unselected adults have been tested, such a criterion 
of equality of units of ability will be unavailable. In lieu of such a 
criterion, an alternative method was used. 

The Procedure Used for Determining Equality of Increments of 
Ability. — Although we have felt that the units in one part of a single 
test scale were very apt to be of greater value than those in some 
other part, it is quite probable that if the upper units of some test 
scale must be considered as measuring greater increments of ability 
than the lower units, the opposite probably might be considered 
true of some other test scale, so that taking the test scales all to- 
gether, the median value of the units in one part may be considered 
as equal to the median value of the units in any other part. Pro- 
ceeding upon that hypothesis, the most probable true form of the 
distribution of abilities of the 121 pupils was determined by obtain- 
ing a composite of the separate distributions for the ten tests as 
follows: 

1. The score attained in each test by the 30th individual in rank
(beginning with the lowest) was assigned a preliminary point-value
of 40 points and the score attained by the 90th individual in rank
was assigned a preliminary point-value of 60 points.*

2. Tentative point values corresponding to all the other scores 
were then determined in such a manner that the units in all parts 
of the test scale were represented by equal increments of points. 
This was accomplished graphically in each case by drawing a straight 
line. 

3. From the smooth curves of distribution of test scores were 



*These scores were not the actual scores of those individuals but the scores
corresponding to them on smooth curves through the distributions of consecutive scores.
then determined the scores attained by the 3rd, 9th, 15th, 60th,
105th, 111th, and 117th individuals in rank order. These points in
the distribution curves were believed to best reveal any skewness
of the distribution.

4. The preliminary point values corresponding to the scores at- 
tained by the 3rd individual in each test distribution were then 
ascertained. These were then plotted in order of magnitude and 
a median value determined by means of a smooth curve through 
the plotted points. This median point value was 24.4. The 
other median point values were as follows: 



Individual in order:   3      9      15     (30)   60     (90)   105    111    117
Point value:           24.4   29.7   33.3   (40)   50.1   (60)   66.7   70.1   75



It should be stated that these values indicate that the distribution
of abilities of the 121 pupils was approximately normal.

5. Since the median of the preliminary point values obtained by
the 3rd individual in rank in the several test distributions was
24.4, this value may be assumed to be the most probable true
value, in terms of our established absolute units, of the ability in
any test which the 3rd individual in rank order attained. The
score in each test attained by the 3rd individual in rank order (by
the curve) was then given, therefore, the corrected point value
24.4. Similarly the score in each test attained by the 9th individual
was then given the corrected point-value 29.7, etc.

6. In order to determine the corrected point value to be similarly 
assigned to all the other scores in each test, a graph was made for 
each test in which the preliminary point values corresponding to 
the scores attained by the 3rd, 9th, etc., individuals were plotted 
as ordinates and the new point values, 24.4, 29.7, etc., plotted as 
abscissae. A smooth curve was then drawn through the series of 
plotted points. This curve was then taken as showing the relation 
between the preliminary and corrected point values corresponding 
to each score in the test. From this curve for each test were taken 
the corrected point values corresponding to each score. These 
are shown in Table IV. They no doubt represent the nearest 
approach that can be made to a true absolute point scale. 
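
Steps 1, 2, 5, and 6 above can be sketched in code. This is only an illustration under stated assumptions: the raw scores and the per-test preliminary values are hypothetical, the function names are invented here, and piecewise-linear interpolation stands in for the smooth curves that were drawn by hand in the study. Only the nine median point values come from the text.

```python
from bisect import bisect_right

def preliminary_points(score, score_30th, score_90th):
    """Steps 1-2: straight line through (score_30th, 40) and (score_90th, 60)."""
    slope = 20.0 / (score_90th - score_30th)
    return 40.0 + slope * (score - score_30th)

def interpolate(x, xs, ys):
    """Piecewise-linear stand-in for the hand-drawn smooth curve of step 6.

    xs must be increasing; values outside the range are extrapolated
    along the nearest segment.
    """
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

# Median point values from the text (ranks 3, 9, 15, 30, 60, 90, 105, 111, 117).
CORRECTED = [24.4, 29.7, 33.3, 40.0, 50.1, 60.0, 66.7, 70.1, 75.0]

# Hypothetical preliminary point values of ONE test at those same ranks.
prelim_at_ranks = [22.0, 28.0, 32.5, 40.0, 51.0, 60.0, 68.0, 72.0, 78.0]

# Hypothetical raw scores: 10 for the 30th pupil, 20 for the 90th, 14 for this pupil.
prelim = preliminary_points(score=14, score_30th=10, score_90th=20)
corrected = interpolate(prelim, prelim_at_ranks, CORRECTED)
print(prelim, round(corrected, 2))
```

A smoother interpolant could replace `interpolate` without changing the rest of the sketch.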

Considerations with Regard to Weighting and Combining the Scores. 
— After finding the corrected point values corresponding to each 
test score, the scores of each pupil in each test were transmuted into 
terms of points and the total score found for each. These are given 
in Appendix II. 

This method of combining the scores resulted in equal weight 
being given to each test. No doubt some of the tests are more 
significant than others in the measurement of general ability, how- 
ever we conceive it. Unreliability of a test, of course, lowers its 
significance. Other aspects of significance depend upon the con- 
ception of general ability. If a test is considered as measuring 
general ability only to the extent to which the factors entering into 
the ability tested are common to other abilities, both as to number 
of factors and as to number of abilities to which they are common, 
then the degree to which a test may be considered as measuring 
general ability is expressed by the amount of "correlational spread" 
of the test, to use McCall's expression, by which is meant the sum 
of the intercorrelations of the test with other tests comprising a 
fairly representative collection, each presumed to involve factors 
common to the others. The last qualification is necessary since, 
if the group of tests is too restricted in kind, certain 'specific' abil- 
ities may be common to too large a proportion of the tests and thus 
vitiate the criterion of general ability.* 

On the other hand if a test is considered as contributing to the 
measure of general ability if it measures an ability that may be 
considered valuable in aiding the individual to adjust himself to 
the new problems and conditions of life, whether such ability has 
few or many factors in common with others; then it is not proper 
to use only the criterion of correlational spread. Two possible al- 
ternatives suggest themselves. If there were available for the 
individuals tested a satisfactory criterion of their powers of adapta- 
tion to the new conditions and problems of life, in the nature of 
a measure of economic or scholastic success, then it would be nec- 
essary merely to weight the tests according to the regression equa- 
tion method, so as to obtain the best correlation of the composite 
score with the criterion. In lieu of such a criterion, the tests might 
be weighted according to a combination of the weights assigned by 
a number of judges. In this study, for instance, the results of all 
the tests except that of Memory for Digits correlated uniformly 
highly with each other. The Digit Test, which showed a reliability 
not the least among the ten tests, stood quite apart from the other 
tests in showing low correlations with all of them. According to 
the criterion of correlational spread, this test would be weighted 
very much lower than any of the others. According to either of 
the criteria pertaining to the second conception of general ability, 
however, the Digit Test might perhaps deserve a weight more nearly 
the amount of the others. 



*Some mathematical reasoning bearing on this point is given in Appendix III, 
1 and 2. 

McCall used this criterion in his study (Ref. 5). Another criterion which he also 
used was the correlation of each test with "Composite," a measure obtained by 
combining the scores of all the tests (with some exceptions) after weighting each 
according to a priori considerations as to the value of the tests. Although the 
correlations of the several tests with Composite appear to have been determined by 
McCall by separate calculations, it would have been possible to obtain the values 
of these correlations with Composite more simply from the values of the inter-test 
correlations. The necessary procedure is given in Appendix III, 3. 

Fig. 1 
Showing the Relation between Total Point Score and Age 
(Horizontal axis: Age.) 



In this study we are inclined more to the second conception of 
general ability mentioned. It was not feasible in this study, how- 
ever, to use either of the criteria appropriate to this conception. 
To weight the tests according to reliability alone, it would be nec- 
essary to weight each inversely in proportion to the square of the 
probable error (the probable errors being in comparable terms). 
(See Merriman, Ref. 6, p. 95.) Such procedure, however, practically 
implies that all the tests aim to measure the same thing. 
But since they do not, any weighting given to compensate for different 
degrees of reliability necessarily also emphasizes the effect of 
certain particular abilities and is to that extent undesirable. 

For these reasons we have combined the test scores without 
weighting them. 
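
For illustration only, the reliability-based weighting described above (each test weighted inversely as the square of its probable error), which was not adopted in this study, might be sketched as follows. The probable errors and point scores are hypothetical, and the function names are invented for this sketch.

```python
def reliability_weights(probable_errors):
    """Weights proportional to 1 / PE^2, normalised to sum to 1."""
    inverse_squares = [1.0 / (pe * pe) for pe in probable_errors]
    total = sum(inverse_squares)
    return [w / total for w in inverse_squares]

def weighted_score(points, weights):
    """Weighted mean of one pupil's point scores across the tests."""
    return sum(p * w for p, w in zip(points, weights))

# Hypothetical probable errors for two tests, in comparable point units.
weights = reliability_weights([2.0, 4.0])
# Hypothetical point scores of one pupil in those two tests.
score = weighted_score([60.0, 40.0], weights)
print(weights, score)
```

Halving a test's probable error quadruples its weight, which is why such a scheme strongly favors whatever particular abilities the most reliable tests happen to measure.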

Finding Age Norms in Terms of Point Scores. — For finding age 
norms, a plot was made. (See Fig. 1.) One point pertains to each 
pupil. The abscissa of each point represents the pupil's age, and 
the ordinate his total point score. In order to find the score which 
would be considered normal for 10-year-olds, the average score was 
found of all pupils of ages from 9 years, no months, to and including 
11 years, no months; for 11-year-olds, the average score was found 
of all pupils of ages 10 years to and including 12 years, etc. The 
norms thus found were as shown in Table V. These values were 
then plotted. (See Fig. 2.) To our surprise, the points represent- 
ing the norms for ages 10 to 14 lay in almost a perfectly straight line, 
which suggests that they are fairly reliable, at least, for the school 
population tested. This was not expected considering the gaps 
left by omitting the fifth and seventh grades from the group tested. 
The norms for years 15, 16, 17 may be seen to fall below the line, 
the latter two quite markedly. This was to be expected, of course, 
since the pupils of these ages were selected, being retarded in their 
schooling. While the true norms for these ages are doubtless above 
the average values obtained, it was not deemed proper to continue 
the straight line. The line was therefore curved off to the right as 
shown. We must regard the norms for the ages above 15, as being 
only roughly approximate. 
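
The averaging rule just described (for a given age, all pupils within one year on either side, end points included) can be sketched as below. The pupil records are hypothetical, ages are assumed to be held in months, and the function name is invented for this sketch.

```python
def age_norm(pupils, age_years):
    """Average total score of pupils aged from (age - 1) to (age + 1) years inclusive.

    pupils: iterable of (age_in_months, total_point_score) pairs.
    Returns None when no pupil falls in the window.
    """
    low, high = (age_years - 1) * 12, (age_years + 1) * 12
    scores = [score for age_months, score in pupils if low <= age_months <= high]
    return sum(scores) / len(scores) if scores else None

# Hypothetical pupils: (age in months, total point score).
pupils = [(108, 380), (120, 400), (132, 430), (133, 500)]
print(age_norm(pupils, 10))
```

Because adjacent windows overlap by a full year, each pupil contributes to two or three norms, which smooths the observed values.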









TABLE V. 
Showing Age Norms in Point Scores 

Age:               8    9   10   11   12   13   14   15   16   17   18   19 
Observed norms:    -    -  404  446  487  527  566  583  550  584    -    - 
Smoothed norms:  324  364  405  445  486  526  566  600  624  638  647  650 

Fig. 2 
Showing a Smooth Curve through the Age Norms of Total Point Scores 
(Horizontal axis: age in years, 10 to 19.) 

Completing the Absolute Point Scale. We have previously (see 
Ref. 10) given the name, Coefficient of Brightness, to the quotient 
that would be obtained by dividing the measure of the absolute 
amount of mental ability of any individual by the measure of the 
absolute amount of mental ability which was normal for the age of 
that individual. This means, of course, that the measures of 
mental ability must be in such terms that not only will equal 
increments of ability be measured by equal increments of the scale, 
but twice as many units on the scale will represent twice as much 
ability, etc. In other words, zero of the scale must represent just 
absence of ability. Before it was possible for us to find the 
coefficients of brightness of the pupils tested in this case, therefore, it 
was required to note what correction was necessary in the scale of 
points in order that the number of points representing the ability 
of age zero would be 0. The ages for which we may presume to have 
obtained fairly reliable norms are only those from 10 to 14. 
Inasmuch, however, as the increments of points between the norms for 
these ages are almost exactly the same, it was regarded as proper to 
assume for present purposes that, if continued, the line through the 
norms would be straight the rest of the way to age zero. It was 
then necessary to note what number of points thus corresponded to 
age zero, this number to be considered the true absolute zero of 
the final point scale. To our further surprise, it was discovered 
that by calling the yearly increment of points (below 14) 
approximately 40.5, a line which would pass as nearly as any other through 
the five norms actually reached zero age at zero of the point scale. 
This, of course, was an entirely accidental coincidence and not at 
all necessary. It merely saved us the obligation of subtracting or 
adding a constant to each of the corrected point values assigned to 
the several test scores in order to obtain the final point values 
constituting the completed Absolute Point Scale. 
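
The coincidence can be checked numerically. The sketch below fits a straight line by ordinary least squares to the observed norms for ages 10 to 14 as read from Table V; least squares is an assumption made here for illustration, since the text says only that the line passes "as nearly as any other" through the five norms.

```python
# Observed norms for ages 10 to 14, as read from Table V.
ages = [10, 11, 12, 13, 14]
observed = [404, 446, 487, 527, 566]

n = len(ages)
mean_age = sum(ages) / n
mean_score = sum(observed) / n

# Ordinary least-squares slope and intercept (an assumed fitting method).
slope = (
    sum((a - mean_age) * (s - mean_score) for a, s in zip(ages, observed))
    / sum((a - mean_age) ** 2 for a in ages)
)
intercept = mean_score - slope * mean_age

print(slope, intercept)  # yearly increment in points, and points at age zero
```

With these five values the fitted yearly increment is 40.5 points and the fitted line passes through zero points at age zero, in agreement with the statement in the text.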

The Determination of the Coefficients of Brightness. — Since the 
point values in which the scores of the pupils were expressed proved 
to be those of the Absolute Point Scale, in order to find the coefficients 
of brightness of each pupil, it was necessary merely to divide 
the total point score of each by the score which was normal for his 
age. The norms for the fractional ages were taken from the curve in 
Fig. 2. The coefficients of brightness thus found are given in Ap- 
pendix II. 
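
As a sketch of this division, the code below uses the smoothed norms of Table V and interpolates linearly for fractional ages. Two assumptions are made here: linear interpolation stands in for reading the curve of Fig. 2, and the quotient is multiplied by 100 because the values in Appendix II appear to be on a percentage footing. The function names are invented for this sketch.

```python
# Smoothed norms by age in years, from Table V.
SMOOTHED_NORMS = {
    8: 324, 9: 364, 10: 405, 11: 445, 12: 486, 13: 526,
    14: 566, 15: 600, 16: 624, 17: 638, 18: 647, 19: 650,
}

def norm_for_age(age_years):
    """Norm for a possibly fractional age (linear interpolation, an assumption)."""
    if not 8 <= age_years <= 19:
        raise ValueError("age outside the range of Table V")
    lower = int(age_years)
    if lower == 19:
        return float(SMOOTHED_NORMS[19])
    fraction = age_years - lower
    step = SMOOTHED_NORMS[lower + 1] - SMOOTHED_NORMS[lower]
    return SMOOTHED_NORMS[lower] + fraction * step

def coefficient_of_brightness(total_points, age_years):
    """Total point score divided by the norm for the pupil's age, times 100."""
    return 100.0 * total_points / norm_for_age(age_years)

# Hypothetical pupil: 486 points at exactly 12 years of age.
print(coefficient_of_brightness(486, 12.0))
```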

Appendix I. 

Sample Extracts of Tests: 

Test 1: Spelling 

1. forenoon fournoon ( F ) 

2. intrest interest ( S ) 

3. heighber neighbor ( ) 

4. concider consider ( ) 

5. entertain entertane ( ) 

etc. etc. etc. 

Test 2: Arithmetic 

1. If a boy has 10 cents and then earned 5 cents, how much did he 
have then? ( ) cents 

7. How many years will it take a glacier to move 1000 feet at the 
rate of 100 feet a year? ( ) years 

15. A ship has provision to last her crew of 50 men 6 months. How 
long would it last 30 men? ( ) months 

Test 3: Synonyms and Antonyms 

1. large big ( S ) 

2. decrease increase ( O ) 

3. empty vacant ( ) 

4. knowledge ignorance ( ) 

50. conservative radical ( ) 

Test 4: Memory for Digits 



1. 4739 

2. 2854 

3. 7261 

4. 31759 

5. 42385 

6. 98157 

( ) ( ) ( ) ( ) ( ) ( ) 



VII. Overlapping of Ability Between Grades 
The points in Fig. 1 belonging to pupils in the eighth grade were 
made as circles, those belonging to pupils in the sixth grade, crosses, 
and those belonging to pupils in the fourth grade, triangles. It 
will be noted that there is considerable overlapping between the 
grades even though they are not consecutive. 

The average score of the fourth graders is 385, of the sixth graders, 
514; and of the eighth graders, 605. Suppose we call the norm 
for the third grade 320, the norm for the fifth grade 450, and for 
the seventh grade, 565, as shown in Fig. 1. We then find 8 fourth 
graders out of 43 above the fifth grade norm. Presumably these 
could do satisfactory fifth grade work. We find 1 of these 8 above 
the sixth grade norm. And we find 4 fourth graders below the third 
grade norm. The scattering of the three grades is shown in Table VI. 



TABLE VI 
Showing the Overlapping between the Grades 

Norms:               3rd  4th  5th  6th  7th  8th 
Fourth Grade (43):   4, 19, 12, 7, 1 
Sixth Grade (40):    5, 11, 15, 4 
Eighth Grade (38):   3, 5, 10, 20 



Another rather interesting fact concerning the distributions of 
scores, particularly in the sixth grade, is that there is a tendency 
for the more mature* pupils, intellectually, to be the younger ones. 
It would seem from this and the many other similar investigations 
that this is invariably the case. The most mature pupil in the 
sixth grade is, in fact, next to the youngest, while the oldest is next 
to the least mature. In the eighth grade, also, the youngest is 
more mature, intellectually. There is, in other words, a negative 
correlation between age and maturity in the single grades. If 
pupils were graded according to intellectual maturity only, there 
would be no appreciable correlation, positive or negative, between 
age and maturity in a single grade. The fact of negative correlation, 
therefore, suggests strongly that some bright pupils (mature 
but young) have been held back by the inelastic system of grading 
and that dull pupils have been promoted beyond their ability. 
This is one of the evils which mental testing should eventually 
remedy. 

*It has been necessary in this case to avoid the use of the ambiguous word, 
intelligence, which is used by nearly all writers on mental testing to mean both 
maturity, irrespective of age, and brightness, that is, maturity with respect to age. The 
statement that, in a single grade, the youngest are also the most intelligent, according to 
the second meaning, would be a mere platitude. This would be true even if there 
were zero correlation between age and maturity. 

VIII. The Refinement of the Scale 

Finding the Order of Difficulty of the Elements of the Tests. While 
not essential, it is nevertheless very desirable to have the elements 
of each test arranged in the order of difficulty. The relative de- 
grees of difficulty of the elements of a test are, of course, probably 
not the same for any two individuals. The best arrangement, 
however, is probably the order of the elements according to the 
number of individuals who pass each, beginning, of course, with the 
easiest. In order to determine this ranking, the number of 
individuals who failed in each element was found during the scoring, 
for the Spelling, Arithmetic, Synonyms, Proverbs, and Relation 
Tests. To give an idea of the distribution of difficulties of the 
elements of these five tests in Scale 1, Fig. 3 was made. The 
horizontal position of each circle represents the number of individuals who 
failed in a given element. The circles at the left, therefore, repre- 
sent easy elements. It is apparent from this as well as other sources 
that the Spelling Test is too easy for this group of individuals. 
The elements should be of such difficulty that the median element, 
in difficulty, is passed by about 50% of the group. The Synonym 
Test is somewhat too easy. The Arithmetic problems appear to 
fall into two distinct groups in difficulty. Problems of medium 
difficulty should be substituted for some of the others. The distri- 
butions of difficulty in the Relation and Proverb Tests are, per- 
haps, fairly satisfactory. 
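
The ranking rule described above can be sketched in a few lines. The failure counts below are hypothetical and the function names are invented here; only the rule itself (order the elements by the number of individuals failing each, easiest first, and look at the proportion passing the median element) follows the text.

```python
def order_by_difficulty(failures):
    """Element labels sorted from easiest (fewest failures) to hardest."""
    return sorted(failures, key=lambda element: failures[element])

def median_element_pass_rate(failures, n_individuals):
    """Proportion of the group passing the element of median difficulty."""
    ordered = order_by_difficulty(failures)
    median_element = ordered[len(ordered) // 2]
    return 1.0 - failures[median_element] / n_individuals

# Hypothetical failure counts for five elements among 121 individuals.
failures = {1: 0, 2: 40, 3: 12, 4: 95, 5: 61}
print(order_by_difficulty(failures))
print(median_element_pass_rate(failures, 121))
```

A pass rate for the median element well above one half would indicate a test that is too easy for the group, as was found for the Spelling Test.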

The Diagnostic Value of the Single Test Elements. It is not deemed 
within the scope of this study to investigate the value of each element 
of each test, as for example, a single problem in Arithmetic, as a 
measure of general ability such as is measured by the total point 
score of an individual, or of general arithmetical ability as measured 
by the arithmetic score. However, as suggestive of means by which 
this may be done, we have examined the sixteen elements of Arith- 
metic Test I with the view to discovering which were the most 
suitable to be included in a test designed to be part of a scale for 
measuring general ability. 

The method employed was as follows: The 121 individuals were 
first ranked in order of their total point scores. The papers of the 
121 individuals in the Arithmetic Test I were then arranged in the 
same rank order. The sequence of passings and failings of 
Problem 1, Problem 2, etc., were noted. These are represented in 
Fig. 4 for each of the 16 problems and for the 121 individuals. There 
is one dotted line for each problem; each dotted line contains 121 
units, one for each individual in the order of their total point scores. 
The presence of a unit of line indicates a problem correct; the 
absence of a unit of line indicates a problem failed. The lines are 
arranged in the order of difficulty of the problems, beginning with the 
easiest; the numbers of the problems represented are given at the 
left. The number of passes for each problem is represented by the 
position of a small circle on the line. In this figure the relative 
values of the problems as measures of general ability are shown by 
the relative amounts of overlapping of passes and failures — the 
greater the overlapping, the less the diagnostic value of the problem. 

Fig. 3 
Showing the Distributions of Difficulties of the Elements of Five Tests 
(Rows: Spelling, Synonyms, Arithmetic, Relation, Proverbs.) 

If the range of abilities of the individuals tested had been 
sufficiently broad so that the complete range of overlapping was 
represented for each problem, it would be a comparatively simple matter 
to express the relative diagnostic value of each problem by a single 
number. For example, let us suppose that no individuals having 
measures of general ability above those represented in the figure 
would fail in either of Problems 7 and 8 and that no individuals 
having measures of general ability below those represented would 
solve either of these problems. If there were no overlapping in 
either case a full line would extend just to the circle representing 
the total number of passes, these being 94 and 55 respectively. 
Since in line 7 there are 10 failures before and 10 passes after the 
circle, we could represent the amount of overlapping by the number 
10. Similarly, since in line 8 there are 23 failures before and 23 
passes after the circle, we could represent the amount of overlapping 
in this case by the number 23. A rank order of the problems 
according to these numbers would, under the conditions mentioned 
above, give a serviceable indication of the comparative diagnostic 
values of the problems. 
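
The overlap number just described can be computed as follows. This is a sketch under two assumptions: the pass/fail record of a problem is held as a list ordered from the highest total point score to the lowest, and the "circle" sits at the position equal to the total number of passes. The data are hypothetical.

```python
def overlap(passes):
    """Failures before the circle, which always equals passes after it.

    passes: list of booleans, one per individual, ordered from the highest
    total point score to the lowest (an assumed ordering).
    """
    circle = sum(passes)  # position of the circle = total number of passes
    return sum(1 for passed in passes[:circle] if not passed)

# Hypothetical record for eight individuals, best first.
record = [True, True, False, True, False, True, False, False]
print(overlap(record))
```

With no overlapping the first `circle` entries would all be passes and the function would return 0; larger values indicate a less diagnostic problem.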



Fig. 4 
Showing the Rank Order in Intelligence of the Individuals who Passed Each Problem 
of Arithmetic Test I. 

Fig. 5 
Showing the Diagnostic Value of Each Problem of Arithmetic Test I 

Inasmuch, however, as only a portion of the overlapping is 
represented in each case, it becomes necessary to adopt further means 
of ascertaining the relative diagnostic values. A method offering 
greater refinement is illustrated in Fig. 5. In this figure each 
group of seven circles pertains to one problem. The height of the 
first circle, according to the scale at the left, represents the number 
of passes by the first 30 individuals in rank order. The height of 
the second circle represents the number of passes by individuals 
16 to 45, inclusive, the third circle, individuals 31 to 60, etc., em- 
bracing 30 individuals in each group, the last group being 91 to 120. 
There is, of course, a tendency in each case for the succeeding num- 
bers of passes to decrease. Theoretically, there should be a tend- 
ency for the circles to lie in a smooth curve of the form of an ogive. 
The steepness of the curve would indicate the degree of diagnostic 
value of the problem. The merit of this method is that it is pos- 
sible in many cases to obtain a fairly good idea of the true slope 
of the curve for a given problem from only partial data. Curves 
have been drawn in what was judged by the eye to be approximately 
the true position of the curve. The problems may now be ranked 
in diagnostic value according to the slope of the curve at the 50% 
point. Thus it will be seen that the diagnostic value of Problem 8 
appears to be the best. The value of Problem 1, of course, cannot 
be found. Quite possibly it would be as good as the average for 
lesser degrees of ability. For this group of individuals, of course, 
it has no value except perhaps for illustrative purposes. With the 
exception of Problems 6 and 7, possibly of 3 and 16, it would seem 
that the diagnostic values of the problems may be considered satis- 
factory. 
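
The seven overlapping groups of 30 individuals (1 to 30, 16 to 45, and so on to 91 to 120) can be tallied as in the sketch below. The pass/fail record is hypothetical and the function name is invented here; fitting the ogive and reading its slope at the 50% point, which was done by eye in the study, is not attempted.

```python
def window_pass_counts(passes, width=30, step=15):
    """Number of passes in each overlapping group of `width` individuals.

    passes: list of booleans in rank order of total point score.
    """
    last_start = len(passes) - width
    return [sum(passes[start:start + width]) for start in range(0, last_start + 1, step)]

# Hypothetical problem: passed by the first 60 of 121 individuals and by no others.
record = [True] * 60 + [False] * 61
print(window_pass_counts(record))
```

A problem with perfect diagnostic value, as in this hypothetical record, drops from 30 passes to none within two steps; the more gradual the decline, the lower the diagnostic value.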

It should be noted that the horizontal position of the point at 
which the curve crosses the 50% line affords a refined measure of 
the degree of general ability to which the ability to solve the prob- 
lem corresponds. Thus, Problem 8, for example, may be consider- 
ed "standard" for the degree of general ability slightly less than 
that of the 90th individual in rank order, say of the 93rd; or of 600 
points. This method is practically that suggested for the standard- 
ization of tests of the Binet Scale (see Ref. 15). To thus express the 
degrees of difficulty of the several problems in terms of the Absolute 
Scale would assist in making the increments of difficulty between 
the problems equal. Such an amount of refinement, however, is 
not considered to be of great value until more of the obvious 
defects of the scale have been eliminated. 

To determine the relative values of the problems as measures of 
"arithmetical ability," it would be necessary, of course, merely to 
rank the papers in the order of the arithmetic scores instead of the 
total scores. 

IX. Inter-Test Correlations 

Considering the fact that double measures were obtained of each 
ability tested and that the number of individuals was comparatively 
large, it was deemed valuable to obtain the inter-test correlations. 

These are given in Appendix IV. They are correlations between 
measures obtained in each case by combining the results of the two 
tests of a kind. There are given both the raw coefficients and those 
corrected for attenuation due to errors of measurement. The 
formula used for correcting for attenuation was as follows (see 
Ref. 3): 

\[
r_{ab(\text{corrected})} = \frac{r_{ab(\text{raw})}}{\sqrt{r_{aa}\, r_{bb}}}
\]
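
A minimal sketch of the correction for attenuation, with hypothetical input values and a function name invented here:

```python
from math import sqrt

def correct_for_attenuation(r_ab_raw, r_aa, r_bb):
    """Raw correlation divided by the square root of the product of the
    two reliability coefficients."""
    return r_ab_raw / sqrt(r_aa * r_bb)

# Hypothetical values: raw correlation .60, both reliabilities .80.
print(correct_for_attenuation(0.60, 0.80, 0.80))
```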

It is considered necessary to leave the discussion of the inter- 
correlations to a later article. 

X. Further Considerations Regarding Reliability 

The Reliability Coefficient of the Point Scale. To find the reliability 
coefficient of correlation between Scale I and Scale II, we may proceed 
as follows. Let us call the ten tests of Scale I $a_1, b_1, c_1, \ldots, j_1$, 
and those of Scale II, $a_2, b_2, c_2, \ldots, j_2$. Then by the formula 
(see Ref. 13) for the correlation of the sums of several variables, 
the standard deviations of the distributions of scores in the several 
tests having been made equal, 

\[
r_{(a_1+b_1+c_1+\cdots+j_1)(a_2+b_2+c_2+\cdots+j_2)}
= \frac{r_{a_1a_2}+r_{a_1b_2}+r_{a_1c_2}+\cdots+r_{b_1a_2}+r_{b_1b_2}+r_{b_1c_2}+\cdots+r_{c_1a_2}+r_{c_1b_2}+r_{c_1c_2}+\cdots}
{\sqrt{10+2(r_{a_1b_1}+r_{a_1c_1}+\cdots+r_{b_1c_1}+\cdots)}\;\sqrt{10+2(r_{a_2b_2}+r_{a_2c_2}+\cdots+r_{b_2c_2}+\cdots)}}
\tag{1}
\]

\[
= \frac{(\text{sum of 10 reliability coefficients, } r_{a_1a_2}, \text{ etc.}) + 2\,(\text{sum of 45 coefs. of intercorrelation, } r_{a_1b_2}, \text{ etc.})}
{\sqrt{10+2\,(\text{sum of 45 coefs. of intercor., } r_{a_1b_1}, \text{ etc.})}\;\sqrt{10+2\,(\text{sum of 45 coefs. of intercor., } r_{a_2b_2}, \text{ etc.})}}
\]

But since the correlations $r_{a_1b_1}$, $r_{a_2b_2}$, and $r_{a_1b_2}$ tend to be equal, we 
may take the sums of each of the three sets of 45 coefficients of 
correlation to be equal to one another. We may therefore simplify 
the equation thus: 

\[
r_{(\text{Scale I})(\text{Scale II})} = \frac{\sum r_{xx} + 2\sum r_{xy}}{10 + 2\sum r_{xy}}
\tag{2}
\]

in which $\sum r_{xx} = r_{aa} + r_{bb} + r_{cc} + \cdots$ 

and $\sum r_{xy} = r_{ab} + r_{ac} + r_{ad} + \cdots + r_{bc} + \cdots$ 

Since the intercorrelations were found between double measures 
in each test, it is necessary to express the reliability coefficients also 
in terms of double measures. As these were found in terms of single 
measures, each has been transmuted into terms of double measures 
by means of the formula,* 

\[
r_{2a\,2a} = \frac{2\,r_{aa}}{1 + r_{aa}}
\tag{3}
\]

in which $r_{2a\,2a}$ and $r_{aa}$ are respectively the correlations between 
double measures and between single measures of any abilities. The 
reliability coefficients in each test in terms of double measures are 
shown in the fourth column of Table III. Their sum ($\sum r_{xx}$) is 
shown to be 8.877. The sum of the 45 coefficients of intercorrelation 
multiplied by 2 = 56.478 ($= 2\sum r_{xy}$). Solving formula 2 above, 

\[
r_{(\text{Scale I}_2)(\text{Scale II}_2)} = \frac{8.877 + 56.478}{10 + 56.478} = .983
\]

This is when Scale I$_2$ and Scale II$_2$ are considered as double scales. 
To find the reliability coefficient of correlation between Scale I 
and Scale II as single scales, i. e., each composed of the tests truly 
comprising it (call these Scale I$_1$ and Scale II$_1$), then, according to 
the formula,† 

\[
r_{aa} = \frac{r_{2a\,2a}}{2 - r_{2a\,2a}}
\]

in which $r_{aa}$ and $r_{2a\,2a}$ have the same meanings respectively as 
before, 

\[
r_{(\text{Scale I}_1)(\text{Scale II}_1)} = \frac{.983}{2 - .983} = .967
\]

This, then, is the reliability coefficient of the Point Scale. 
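
The arithmetic of this section can be retraced in a short sketch. The two sums (8.877 and 56.478) are those given in the text; the function names are invented here for illustration.

```python
def composite_reliability(sum_reliabilities, twice_sum_intercorrelations, n_tests=10):
    """Formula 2: reliability of a sum of n_tests equally variable tests."""
    numerator = sum_reliabilities + twice_sum_intercorrelations
    return numerator / (n_tests + twice_sum_intercorrelations)

def double_from_single(r_single):
    """Formula 3: reliability of a double measure from that of a single measure."""
    return 2.0 * r_single / (1.0 + r_single)

def single_from_double(r_double):
    """Inverse of formula 3."""
    return r_double / (2.0 - r_double)

r_double_scale = composite_reliability(8.877, 56.478)
r_single_scale = single_from_double(round(r_double_scale, 3))
print(round(r_double_scale, 3), round(r_single_scale, 3))
```

Applying `double_from_single` to the result of `single_from_double` returns the starting value, which is a convenient check that the two formulas are true inverses.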

*This formula is a corollary to formula 1 above. 
†The inverse of formula 3. 

The Probable Error of the Point Scale 
As has been shown (see Ref. 11), 

\[
r = 1 - \frac{\sigma_e^2}{\sigma_{\text{dist.}}^2}
\tag{1}
\]

in which $r$ is the reliability coefficient of correlation between two 
series made up of pairs of measures, $a_1, a_2$; $b_1, b_2$; $c_1, c_2$; etc.; in 
which $\sigma_e$ is the standard deviation of the errors of measurement of 
a, b, c, etc.; and in which $\sigma_{\text{dist.}}$ is the standard deviation of the 
distribution of values, a, b, c, etc., in either series. 
From equation 1 it follows that 

\[
\frac{\sigma_e^2}{\sigma_{\text{dist.}}^2} = 1 - r, \quad \text{whence} \quad \sigma_e^2 = (1 - r)\,\sigma_{\text{dist.}}^2,
\]

\[
\sigma_e = \sigma_{\text{dist.}}\sqrt{1 - r}, \quad \text{and} \quad \text{P. E.} = .6745\,\sigma_{\text{dist.}}\sqrt{1 - r}
\tag{2}
\]

in which P. E. is the probable or median error of measurement. 

If, now, we consider $r$ to be the reliability coefficient of the Point 
Scale, P. E. as the probable error of measurement by either Scale I 
or Scale II, and $\sigma_{\text{dist.}}$ as the standard deviation of the point scores 
by the same scale, then we may solve equation 2 for the probable 
error of the scale. The standard deviation of the distribution of 
scores by the scale (average of both) was found to be 111 points. 
Solving equation 2, 

\[
\text{P. E.} = .6745\,\sqrt{1 - .968} \times 111 = 13.7
\]

The Probable Error of the Point Scale, therefore, is 13.7 points. 
This is 2.7% of the median score (500) of the whole group, or 
3.0% of the total range of scores (461). 
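
Equation 2 is easily evaluated by machine. In the sketch below the function name is invented and the inputs are hypothetical round numbers chosen only to show the form of the computation, not the values of this study.

```python
from math import sqrt

def probable_error(reliability, sd_distribution):
    """Equation 2: P. E. = .6745 * sigma_dist * sqrt(1 - r)."""
    return 0.6745 * sd_distribution * sqrt(1.0 - reliability)

# Hypothetical inputs: reliability .75, standard deviation 100 points.
print(probable_error(0.75, 100.0))
```

Because the probable error varies as the square root of (1 - r), even a reliability of .75 leaves a probable error equal to about a third of the standard deviation of the scores.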

To view the reliability of the scale from another angle we may 
determine as nearly as possible the probable error of a mental age 
by the scale. Thus, if we may assume that the reliability co-eflficient 
of correlation between mental ages by the scale is approximately 
equal to the reliability coefficient of correlation between the point 
scores and that the distribution of mental ages by the scale is ap- 
proximately equal to the distribution of ages, then we may let P. 
E., in formula 2 above, represent the probable error of a mental 
age by the scale, r represent the reliability coefficient of correlation 
between mental ages by the two scales, and o-dist. represent the 
standard deviation of the distribution of mental ages. Then 
substituting in equation 2 the approximate values of these quantities, 
the standard deviation of the distribution of ages being 28.1 months, 
we have 

\[
\text{P. E.} = .6745\,\sqrt{1 - .967} \times 28.1 = 3.44
\]

The Probable Error of a mental age by the Point Scale may be 
considered, therefore, as approximately 3½ months. This is 
practically the same as the probable error of a mental age by the 
Stanford Revision of the Binet Scale (see Ref. 11). 

XI. Comparisons with School Mark and Amount of Schooling 

Correlation of Total Score with School Mark. The teacher of each 
of the grades furnished for each pupil a final mark representing the 
relative character of his school work for the year. As the marks 
given by the different teachers were not comparable, a separate 
correlation was made for each grade between the school mark and 
the total point score. The coefficients were found for the fourth, 
sixth, and eighth grades to be respectively .80, .41, and .50. Since 
we have no measures of the reliability of the teachers' marks we 
are unable to determine the probable true correlation between in- 
telligence, as measured by the scale, and school performance. 

The Relation of the Coefficient of Brightness to the Amount of School- 
ing. The pupils were asked to tell the length of time they had 
spent in each grade. From these data, the total amount of time 
spent in school was found for each pupil. For convenience, the 
121 pupils were then classed together in groups according to the 
amount of retardation or advance.* The numbers of pupils in 
each class are shown in Table VII. Here again we have no measure 
of the reliability of the reports of schooling and are therefore unable 
to determine the value of the results. There is, however, a very 
definite tendency for advanced pupils to obtain high coefficients of 
brightness and for retarded pupils to obtain low coefficients of bright- 
ness. 

TABLE VII 
Showing the Amounts of Retardation and Advance and their Relation to the Coefficients 
of Brightness. (Data from 10 pupils were missing.) 

                                  Retarded            At       Advanced 
                            3 yrs.  2 yrs.  1 yr.   Grade   1 yr.  2 yrs. 
Number of Pupils               2       8      33      56       8      4 
Average Coef. Brightness      77      88      96     102     103    114 



*These terms refer here merely to the taking of more or less than the normal time 
to reach the present grade; they are irrespective of the pupils' ages of entrance. 

Test 5: Proverbs 
Proverbs 
( 3 ) Make hay while the sun shines. 
( ) In a calm sea every man is a pilot. 

Statements to explain the proverbs. 

1. Deeds show the man. 

2. Leadership is easy when all goes well. 

3. Make the best of your opportunities. 

Test 6: Disarranged Sentences 

1 . name a John is boy's (true false ) 

2. sun morning the the in sets (true false ) 

3 . trees birds nests the in build (true false ) 

Test 7: Relations 

1. hand : arm : : foot : ( ) 

2. hat : head : : thimble : ( ) 

23. education : ignorance : : ( ) : poverty 

1. 1 leg, 2 toe, 3 finger, 4 wrist, 5 elbow. 

2. 1 finger, 2 needle, 3 thread, 4 hand, 5 sewing. 

23. 1 laziness, 2 school, 3 wealth, 4 charity, 5 teacher. 

Test 8: Geometric Test 

(Designs were presented composed of two or more geometrical figures, circles, rectangles, 
and triangles, overlapped.) 

1 . Place a figure 1 so that it will be both in the rectangle and in the circle. 

7. Place a figure 1 so that it will be in both circles, in the triangle, and in only 
one rectangle. 

Test 9: Following Directions 
(A page of Woodworth and Wells' Cancellation Test was supplied each pupil.) 

2. In line 1 [of the forms] place a figure 1 in the first star and a figure 2 in the 
second circle. 

In line 5, place a figure 7 in the form which follows the same kind of form as 
that which follows it. 

Test 10: Narrative Completion 

Once upon a ______ there was a ______ who was very ______

He went from place to ______ trying to find ______

Appendix II.
Showing the Point-Scores of each individual in each test. (Pupils 1 to 41, eighth
grade; 51 to 96, fourth grade; and 101 to 146, sixth grade.)

Pupil Spell. Arith. Synon. Digit. Provb. D. Sen. Rel'n. Geom. Fol. D. Compl. Sums C.B.

    1   77, 76, 76, 80, 81, 68, 61, 52, 67 [one score illegible]                 718  109
    2     55     54     58     76     54      72     53    55      58     62  597   93
    3     68     67     64     41     65      63     71    58      65     64  626  100
    4     56     49     50     45     35      63     44    55      48     58  503   86
    5     63     51     57     43     46      68     53    58      56     68  563   89
    6     64     76     67     62     62      75     49    58      51     58  622  108
    7     59     56     72     53     65      63     69    66      67     65  635  108
    8     62     58     59     50     54      72     69    69      61     68  622   99
    9     65     73     57     41     68      60     67    74      58     68  631  103
   10     57     56     60     50     60      59     69    66      72     72  621  103
   11     54     72     56     62     52      71     67    64      58     48  604  102
   12     70     60     75     32     68      53     75    58      75     68  634  108
   14     72     64     69     45     71      72     73    61      75     83  685  110
   15     53     44     47     30     35      59     41    45      43     56  453   74
   16     63     62     64     36     62      63     63    61      56     58  588   96



Pupil Spell. Arith. Synon. Digit. Provb. D. Sen. Rel'n. Geom. Fol. D. Compl. Sums C.B.

   17     64     66     58     48     52      67     67    55      58     54  589   98
   18     70     58     67     62     71      63     61    69      67     63  651  111
   19     67     49     65     50     60      60     55    52      38     50  546   91
   20     65     64     64     60     65      70     75    64      58     56  641  104
   22     51     58     56     45     52      68     46    64      53     43  536   84
   23     64     64     61     43     52      64     61    40      53     63  565   98
   24     56     70     55     30     52      53     55    49      58     64  542   88
   25     64     56     54     57     54      73     53    66      53     55  585   99
   26     54     66     65     76     60      65     46    66      63     44  605   95
   27     72     64     68     57     81      63     61    61      73     72  672  120
   28     65     70     68     41     65      68     71    64      61     69  642  111
   29     72     56     59     66     58      66     42    61      61     60  601   99
   30     61     64     54     30     56      71     60    64      48     57  565   95
   32     54     73     66     64     68      75     69    55      65     66  655  109
   33     56     51     65     57     50      68     65    72      73     66  623  100



Pupil Spell. Arith. Synon. Digit. Provb. D. Sen. Rel'n. Geom. Fol. D. Compl. Sums C.B.

   34     64     58     50     55     46      49     58    29      43     54  506   83
   35     64     60     53     45     50      59     46    52      61     56  546   92
   36     64     67     74     75     68      68     61    72      72     62  683  119
   37     54     77     48     75     56      70     56    66      53     66  621  107
   38     68     70     72     53     56      68     60    61      70     66  644  115
   39     51     67     63     36     52      65     75    58      51     60  578   92
   40     70     70     46     64     71      65     75    66      58     64  649  115
   41     64     79     73     78     75      63     82    72      75     71  732  133
   51     27     44     19     41     40      48     26    29      29     27  330   75
   52     32     27     35     36     31      38     32    36      29     36  332   87
   53     39     39     32     68     35      40     46    41      46     44  430  113
   55     31     34     32     50     24      25     29    43      41     35  344   82
   56     53     19     25     45     31      48     30    25      29     35  340   93
   58     46     46     43     50     31      49     38    45      41     52  441  110
   59     29     39     29     36     31      42     38    36      32     38  350   79



Pupil Spell. Arith. Synon. Digit. Provb. D. Sen. Rel'n. Geom. Fol. D. Compl. Sums C.B.

   60     36     27     43     57     35      38     44    45      32     31  388   99
   61     40     23     30     27     31      27     28    22      21     22  271   67
   62     40     39     29     34     31      32     25    38      38     23  329   73
   63     43     32     37     41     24      37     31    22      29     30  326   90
   64     30     36     32     55     35      41     12    28      35     24  328   85
   65     41     32     47     57     48      35     55    49      46     59  469  112
   66     46     39     62     48     68      46     44    55      48     41  497  145
   67     53     49     48     45     56      41     44    55      48     61  500  120
   68     37     36     50     57     40      45     33    23      38     45  404   94
   69     29     23     31     50     31      27     25    30      35     22  303   61
   70     31     44     29     50     31      40     49    45      41     39  399   87
   71     35     36     27     53     24      40     38    41      29     26  349   77
   72     39     44     40     43     45      40     56    58      43     43  451  111
   73     38     46     54     69     50      34     51    45      51     47  485  110
   74     37     32     40     30     40      34     34    38      43     38  366   93



Pupil Spell. Arith. Synon. Digit. Provb. D. Sen. Rel'n. Geom. Fol. D. Compl. Sums C.B.

   75     45     41     50     34     48      45     42    61      43     34  443  117
   76     32     27     34     24     38      25     36    52      29     37  334   80
   77     42     27     27     64     24      28     32    25      29     33  331   75
   78     40     32     28     64     35      30     34    38      43     37  381   93
   79     36     36     43     45     31      43     41    33      46     48  402  102
   80     50     46     51     57     43      54     53    64      61     66  545  127
   81     33     36     28     39     40      32     31    38      41     35  353   68
   82     46     32     30     39     43      45     46    41      41     39  402  110
   83     35     44     49     64     52      45     55    45      51     42  482  119
   84     34     39     33     55     31      38     39    26      35     33  363   95
   85     44     34     28     22     38      23     32    43      29     39  332   80
   86     29     36     39     22     40      32     33    45      46     33  355   88
   87     44     49     36     43     38      40     38    52      51     43  434  107
   88     51     49     54     45     35      45     42    41      32     42  436  115
   89     15     34     46     50     48      44     29    37      35     36  374   94




Pupil Spell. Arith. Synon. Digit. Provb. D. Sen. Rel'n. Geom. Fol. D. Compl. Sums C.B.

   90     42     41     48     60     40      45     44    43      41     49  453  122
   91      8     36     22     30     24      41     29    28      32     23  273   59
   92     34     34     43     50     35      51     41    32      43     38  401   85
   94     28     32     22     29     35      19     31    24      29     26  275   71
   95     33     32     37     39     35      45     35    49      38     32  375   85
   96     31     51     34     73     43      32     38    41      32     33  408   93
  101     35     62     74     50     65      47     55    45      58     44  535  117
  102     50     62     56     60     62      40     60    66      75     71  602  117
  103     52     54     45     55     43      48     58    55      56     59  525   87
  104     54     44     45     24     40      40     42    31      25     45  390   75
  105     52     49     54     69     62      41     60    45      56     45  533   99
  107     49     44     50     29     38      56     53    45      43     39  446   80
  108     58     49     59     50     45      53     46    52      61     72  545   99
  109     40     49     46     62     43      45     38    38      46     58  465   83
  110     64     51     59     48     52      47     58    49      51     50  529  118




Pupil Spell. Arith. Synon. Digit. Provb. D. Sen. Rel'n. Geom. Fol. D. Compl. Sums C.B.

  111     55     51     49     62     58      40     44    45      48     45  497  106
  112     44     44     40     45     56      45     53    38      48     37  450   88
  114     61     54     62     69     75      49     58    81      70     69  648  132
  115     47     39     51     50     56      56     48    49      38     39  473   92
  118     51     51     41     27     38      51     23    41      32     44  399   64
  119     70     56     75     75     71      55     69    77      78     72  698  168
  120     36     56     43     57     56      33     48    55      56     54  494   96
  121     41     46     50     55     56      41     55    43      61     52  500   93
  122     56     41     55     41     50      61     46    47      46     56  499   87
  123     47     44     54     32     45      45     56    52      51     45  471  115
  124     51     51     53     53     48      48     60    47      58     48  517  102
  125     59     58     45     45     58      26     71    69      53     48  532  106
  127     56     79     72     57     65      56     73    81      72     61  672  144
  128     68     49     51     68     46      47     58    61      51     52  551  112
  129     56     49     52     60     52      48     55    49      65     58  544  127




Pupil Spell. Arith. Synon. Digit. Provb. D. Sen. Rel'n. Geom. Fol. D. Compl. Sums C.B.

  130     64     49     73     43     81      54     61    72      65     63  625  131
  131     55     64     62     50     65      58     60    69      51     63  597  115
  132     53     69     62     34     62      52     55    52      56     63  558  118
  133     45     60     55     64     58      46     58    52      43     55  536  112
  134     64     41     48     60     54      49     48    49      58     53  524  102
  135     46     39     43     53     40      38     39    38      43     44  423   74
  136     52     51     48     69     40      49     48    58      61     44  520   99
  138     47     54     43     34     52      45     56    66      41     46  484   88
  139     44     58     53     75     60      41     60    69      58     60  578  101
  140     47     46     55     62     65      65     55    35      43     42  515  100
  141     51     39     56     48     60      44     49    40      53     53  493   90
  142     49     51     55     53     58      43     36    41      38     51  475   87
  143     40     62     52     30     46      46     58    49      51     48  482   85
  144     52     67     48     53     45      49     58    66      38     41  517   97
  145     61     51     63     48     71      52     49    52      70     70  587  108
  146     59     56     64     62     71      57     67    49      80     60  625  129
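The Sums column appears to be the simple total of the ten test scores in each row. A minimal Python check on three rows copied from the table (the helper name `row_sum_matches` is an illustrative assumption, not part of the original study):

```python
# Rows copied from Appendix II: pupil -> (ten test scores, printed sum).
rows = {
    34: ([64, 58, 50, 55, 46, 49, 58, 29, 43, 54], 506),
    41: ([64, 79, 73, 78, 75, 63, 82, 72, 75, 71], 732),
    146: ([59, 56, 64, 62, 71, 57, 67, 49, 80, 60], 625),
}


def row_sum_matches(scores, printed_sum):
    """Return True when the ten scores add up to the printed sum."""
    return len(scores) == 10 and sum(scores) == printed_sum


checks = {pupil: row_sum_matches(scores, total) for pupil, (scores, total) in rows.items()}
print(checks)
```

The same check can be used to locate a misread digit: a row whose scores do not reach the printed sum contains at least one doubtful entry.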



Appendix III. 

Some Mathematical Reasoning with Regard to Criteria of Tests of 
Intelligence. 

1. If we were to assume that each test measured only a general
factor (one common to all the tests) and one or more factors
specific to that test alone, then the relative degrees in which two
tests correlate with the general factor are expressed, subject to the
chance errors of the coefficients, by the relative degrees to which
these two tests correlate with the other tests. This may be shown
as follows. By the formula for partial correlation,

$$r_{ac.i} = \frac{r_{ac} - r_{ai}\,r_{ci}}{\sqrt{(1 - r_{ai}^2)(1 - r_{ci}^2)}}$$

in which a and c are tests, i is a hypothetical, perfect measure of
the general factor, $r_{ac}$ is the coefficient of correlation between a
and c, etc., and $r_{ac.i}$ is the coefficient of correlation between a
and c which is due to factors other than the general one. But since,
by hypothesis, the general factor is the only source of correlation,

$$r_{ac.i} = 0.$$

Then

$$r_{ac} = r_{ai}\,r_{ci}$$

and similarly,

$$r_{bc} = r_{bi}\,r_{ci}.$$

Therefore

$$\frac{r_{ac}}{r_{bc}} = \frac{r_{ai}}{r_{bi}}.$$

Similarly,

$$\frac{r_{ai}}{r_{bi}} = \frac{r_{ad}}{r_{bd}} = \frac{r_{ae}}{r_{be}} = \frac{r_{af}}{r_{bf}} = \text{etc.}$$

An expression for the combined value of these ratios is given very 
approximately by the ratio of the average intercorrelation between 
a and the other tests to the average intercorrelation between b and 
the other tests. 
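The argument of this section can be illustrated numerically. The sketch below assumes hypothetical correlations of six tests with the general factor (the values in `loading` are invented for illustration and are not data from this study), builds the intercorrelations implied when the general factor is the only source of correlation, and compares the ratio of the correlations of a and b with the general factor against the ratio of their average intercorrelations with the remaining tests.

```python
# Hypothetical correlations of six tests with the general factor i
# (illustrative values only; not data from this study).
loading = {"a": 0.9, "b": 0.6, "c": 0.8, "d": 0.7, "e": 0.5, "f": 0.4}


def r(x, y):
    """Intercorrelation implied when the general factor is the only common source."""
    return loading[x] * loading[y]


others = ["c", "d", "e", "f"]

# Each ratio r_ax / r_bx reproduces r_ai / r_bi.
ratios = [r("a", x) / r("b", x) for x in others]

# Ratio of the average intercorrelations of a and of b with the other tests.
avg_a = sum(r("a", x) for x in others) / len(others)
avg_b = sum(r("b", x) for x in others) / len(others)
ratio_of_averages = avg_a / avg_b

print(ratios, ratio_of_averages, loading["a"] / loading["b"])
```

With real coefficients the separate ratios would differ by chance errors, which is why the text combines them through the averages.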

2. Let us consider now a case in which there is no factor common
to all the tests in the group. To take a very simple example, let
us suppose we have four tests (nos. 1, 2, 3, and 4) testing abilities
each of which is made up of five of the nine elements, A, B, C, D,
E, F, G, H, and I, distributed as follows.

Test 1, A B C D E 

Test 2, A B C D F 

Test 3, C D E F G 

Test 4, B E F H I 

Here it will be noted that no element is common to more than three
abilities. Now the coefficient of correlation between two series of
values is a measure of the percentage of elemental causes common
to both.*

*For example, if five coins are tossed n times and each time the number of heads
appearing is recorded, and if after each independent tossing, one coin is left lying,
the other four tossed again, and the number of heads then appearing is recorded;
then as n approaches infinity, the coefficient of correlation between the number of
heads appearing by the independent tossing and the number of heads appearing by
the dependent tossing approaches .20, attesting to the fact that one fifth of the
causes affecting the number of heads in each throw (one coin in five) was common to
both throws of a pair. If two of the five coins are left lying, the correlation will
approach .40, if three are left lying, .60, if four, .80, and if five, of course, 1.00.
Similarly for other numbers of coins and similarly for elements of abilities.

And since the number of elemental causes common to abilities 1 and 2
is four out of five, the correlation between tests 1 and 2 will tend
to be .80. Three elements are common to abilities 1 and 3, and to
abilities 2 and 3. Therefore the correlations between tests 1
and 3 and between 2 and 3 will tend to be .60. And so on. The
correlation table will therefore appear as follows.

Tests      1      2      3      4

  1       ..    .80    .60    .40
  2      .80     ..    .60    .40
  3      .60    .60     ..    .40
  4      .40    .40    .40     ..

Sums    1.80   1.80   1.60   1.20
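The coin-tossing illustration in the footnote can be checked by exact enumeration rather than by repeated tossing. The sketch below is one way of doing so, on the assumption that every outcome is equally likely; the function name `coin_correlation` is illustrative.

```python
from itertools import product
from math import sqrt


def coin_correlation(n_coins, n_kept):
    """Exact correlation between the head counts of two throws sharing n_kept coins."""
    xs, ys = [], []
    # Enumerate every equally likely outcome: the first throw of all coins,
    # then a fresh toss of the coins that are not left lying.
    for first in product((0, 1), repeat=n_coins):
        for retoss in product((0, 1), repeat=n_coins - n_kept):
            xs.append(sum(first))
            ys.append(sum(first[:n_kept]) + sum(retoss))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / sqrt(var_x * var_y)


# Correlations when 0, 1, 2, 3, 4, or 5 of the five coins are left lying.
correlations = [coin_correlation(5, kept) for kept in range(6)]
print([round(c, 2) for c in correlations])
```

Under these assumptions the correlation equals the number of coins left lying divided by five, in agreement with the values .20, .40, .60, .80, and 1.00 quoted in the footnote.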

A table showing the number of elements common to each pair 
of abilities would appear as follows. 

Tests      1      2      3      4

  1       ..      4      3      2
  2        4     ..      3      2
  3        3      3     ..      2
  4        2      2      2     ..

Sums       9      9      8      6

It may be seen from this table that the number of times the ele- 
ments of ability, 1, appear in the other three abilities is 9. The 
correlation spread of ability, 1, may therefore be said to be re- 
presented by the number 9. The number of times the elements of 
abilities, 2, 3, and 4, appear in the other three abilities are respec- 
tively 9, 8, and 6. We may say, then, that the relative values of 
the correlational spreads of the four tests are as 9 : 9 : 8 : 6. 
Now it may be seen that these are exactly the same proportions 
as 1.80 : 1.80 : 1.60 : 1.20, the sums of the coefficients in the first 
table. The latter values are equal respectively to the former values 
when each is divided by 5, the number of elements in each ability. 
Thus it may be seen that the sums of the correlations of each of
the tests with all of the others afford measures of the relative cor-
relational spread of the tests.
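Both tables of this section follow mechanically from the distribution of elements among the four tests. A short Python sketch (variable names are illustrative) recomputes the counts of common elements, the correlational spreads, and the sums of the expected correlations, taking each correlation as the number of common elements divided by five, as the text assumes.

```python
# Elements composing the ability measured by each test, as listed in the text.
elements = {
    1: set("ABCDE"),
    2: set("ABCDF"),
    3: set("CDEFG"),
    4: set("BEFHI"),
}
tests = sorted(elements)

# Number of elements shared by each pair of different tests.
common = {
    (s, t): len(elements[s] & elements[t])
    for s in tests
    for t in tests
    if s != t
}

# Correlational spread of each test: shared elements summed over the other tests.
spread = {t: sum(common[(t, s)] for s in tests if s != t) for t in tests}

# Sums of the expected correlations (shared elements divided by 5).
corr_sums = {t: round(spread[t] / 5, 2) for t in tests}

print(spread)
print(corr_sums)
```

The spreads come out as 9, 9, 8, and 6, and the correlation sums as 1.80, 1.80, 1.60, and 1.20, the same proportions as stated above.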

3. The coefficient of correlation between any one weighted test
and the weighted composite of a number of tests may be found
from the coefficients of intercorrelation and the weights by the
formula (see Ref. 13) which may be stated in general as follows.

$$r_{wa(x_1 b_1 + x_2 b_2 + x_3 b_3 + \dots)} = \frac{\sum x\,\sigma_b\,r_{ab}}{\sqrt{\sum x^2 \sigma_b^2 + 2\sum x\,x'\,\sigma_b\,\sigma_{b'}\,r_{bb'}}}$$

in which w, $x_1$, $x_2$, etc., are the weights given to the tests a, $b_1$, $b_2$,
etc., and $\sigma_b$ is the standard deviation of the scores of any test, b.
The second sum in the denominator is taken over every pair of different
tests, b and b', in the composite.
McCall's procedure, therefore, might have been to consider $b_1$,
$b_2$, etc., as representing the tests which he wished to embody in
his Composite; to consider $x_1$, $x_2$, etc., as representing the respec-
tive weights to be given these tests, and a as representing any test
it was desired to correlate with the Composite. The correlation
could then have been obtained by solving the equation. The gen-
eral formula is equally applicable, of course, for finding, from the
intercorrelations, the correlation of a test with the average of all
the other tests.
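As a check on the general formula, the sketch below computes the correlation of a test with a weighted composite in two ways: from the intercorrelations, weights, and standard deviations alone (using the standard expression for the correlation of a variable with a weighted sum, which is assumed here to be the formula intended), and directly from the composite scores. The scores and weights are made up for illustration, and all function names are assumptions.

```python
from math import sqrt


def mean(v):
    return sum(v) / len(v)


def sd(v):
    m = mean(v)
    return sqrt(sum((x - m) ** 2 for x in v) / len(v))


def pearson(u, v):
    mu, mv = mean(u), mean(v)
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v)) / len(u)
    return cov / (sd(u) * sd(v))


def composite_correlation(a, tests, weights):
    """Correlation of test a with the weighted sum of tests, from r's and sigmas only."""
    k = len(tests)
    sig = [sd(b) for b in tests]
    num = sum(weights[j] * sig[j] * pearson(a, tests[j]) for j in range(k))
    den_sq = sum((weights[j] * sig[j]) ** 2 for j in range(k))
    for j in range(k):
        for m in range(j + 1, k):
            den_sq += (2 * weights[j] * weights[m] * sig[j] * sig[m]
                       * pearson(tests[j], tests[m]))
    return num / sqrt(den_sq)


# Made-up scores and weights, for illustration only.
a = [3, 5, 2, 8, 7, 6]
b1 = [2, 6, 1, 7, 9, 5]
b2 = [4, 4, 3, 9, 6, 8]
b3 = [1, 7, 2, 6, 8, 4]
weights = [1.0, 2.0, 0.5]

composite = [sum(w * b[i] for w, b in zip(weights, (b1, b2, b3))) for i in range(len(a))]
from_formula = composite_correlation(a, [b1, b2, b3], weights)
direct = pearson(a, composite)
print(from_formula, direct)
```

The two values agree to within rounding, which is the practical point of the formula: the composite scores themselves need never be formed.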

If only the relative values of the correlations of each test with a
composite of weighted tests are desired, these may be obtained more
simply yet; thus, assuming that there were only three tests, a, b,
and c, in the group, weighted respectively w, x, and y, then the cor-
responding formula for the correlation of test a with the weighted
composite would be

$$r_{wa(wa+xb+yc)} = \frac{w\sigma_a r_{aa} + x\sigma_b r_{ab} + y\sigma_c r_{ac}}{\sqrt{w^2\sigma_a^2 + x^2\sigma_b^2 + y^2\sigma_c^2 + 2(w\sigma_a x\sigma_b r_{ab} + w\sigma_a y\sigma_c r_{ac} + x\sigma_b y\sigma_c r_{bc})}}$$

Similarly,

$$r_{xb(wa+xb+yc)} = \frac{w\sigma_a r_{ab} + x\sigma_b r_{bb} + y\sigma_c r_{bc}}{\text{same denominator}}$$

And

$$r_{yc(wa+xb+yc)} = \frac{w\sigma_a r_{ac} + x\sigma_b r_{bc} + y\sigma_c r_{cc}}{\text{same denominator}}$$

Since the denominators are the same in all cases, it may be seen that
the relative values of the correlations of the several tests with the
weighted composite are directly proportional to the sums of the in-
tercorrelations of those tests, each with all the tests, when these
intercorrelations have been weighted as shown in the numerators.
And if the standard deviations have been made equal, this merely
means weighting the several coefficients by the same weights and in
the same order as they would appear in the composite. The same
reasoning holds for any number of tests.

4. If it is desired, on the other hand, to find the relative or ab-
solute amounts of the average intercorrelations of each of a series
of tests with all the others (weights and standard deviations being equal)
as a criterion of the degree to which each test measures the common
factor, and if the values of the separate intercorrelations are not
required, it will be more convenient to derive these average inter-
correlations from the correlation of each test with the average of
all the measures taken together as a composite. That this may be
done is shown as follows.

Repeating the proof given in 3 above in a simpler form, let us
assume again for the moment that there are only three tests, a, b,
and c, in the series; then by the formula for the correlation of one
test with the average of a number of tests (assuming weights and
standard deviations equal),

$$r_{a(a+b+c)} = \frac{r_{aa} + r_{ab} + r_{ac}}{\sqrt{3 + 2(r_{ab} + r_{ac} + r_{bc})}} \tag{1}$$

$$r_{b(a+b+c)} = \frac{r_{ba} + r_{bb} + r_{bc}}{\sqrt{3 + 2(r_{ab} + r_{ac} + r_{bc})}} \tag{2}$$

$$r_{c(a+b+c)} = \frac{r_{ca} + r_{cb} + r_{cc}}{\sqrt{3 + 2(r_{ab} + r_{ac} + r_{bc})}} \tag{3}$$

Letting $\Sigma r_{a,b,c(a+b+c)}$ represent $r_{a(a+b+c)} + r_{b(a+b+c)} + r_{c(a+b+c)}$, and since
$r_{aa} = r_{bb} = r_{cc} = 1$,

$$\Sigma r_{a,b,c(a+b+c)} = \frac{3 + 2(r_{ab} + r_{ac} + r_{bc})}{\sqrt{3 + 2(r_{ab} + r_{ac} + r_{bc})}} \tag{4}$$

$$\Sigma r_{a,b,c(a+b+c)} = \sqrt{3 + 2(r_{ab} + r_{ac} + r_{bc})} \tag{5}$$

Multiplying equation 1 by equation 5, we have

$$r_{aa} + r_{ab} + r_{ac} = r_{a(a+b+c)} \times \Sigma r_{a,b,c(a+b+c)}$$

and similarly for all other correlational sums. Thus it may be
seen that the absolute amounts of the sums or averages of the in-
tercorrelations of any test with all the tests in the series may be
derived from the values of the correlations of each test with the
composite (weights and standard deviations being equal) without
the individual test intercorrelations being found. The same reas-
oning holds for any number of tests. As criteria of the degree
to which any test measures the factor common to the group of
abilities tested, the test's average intercorrelation with all the
tests and its correlation with the composite are of equal value, being,
in fact, the same criterion. The sum of the intercorrelations of
any test with all the other tests (excluding itself) may be obtained,
of course, by subtracting 1.00 (the correlation of the test with it-
self) from the sum of the intercorrelations with all the tests.
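The identity derived in this section can be verified with a small numerical example. The sketch below uses three hypothetical intercorrelations (the values are invented for illustration), forms the correlation of each test with the unweighted composite as in equations 1 to 3, and then recovers each sum of intercorrelations by multiplying that correlation by the sum of all three, as in the final equation.

```python
from math import sqrt

# Intercorrelations of three hypothetical tests (illustrative values only).
r = {
    ("a", "b"): 0.60,
    ("a", "c"): 0.50,
    ("b", "c"): 0.40,
}
tests = ["a", "b", "c"]


def corr(x, y):
    """Look up r_xy, taking each test's correlation with itself as 1."""
    if x == y:
        return 1.0
    return r.get((x, y), r.get((y, x)))


# Sum of the intercorrelations of each test with all tests, itself included.
row_sum = {x: sum(corr(x, y) for y in tests) for x in tests}

# Equations 1 to 3: correlation of each test with the unweighted composite.
denominator = sqrt(sum(row_sum.values()))
r_composite = {x: row_sum[x] / denominator for x in tests}

# Equation 5: the sum of these correlations equals the common denominator.
total = sum(r_composite.values())

# Recover each row sum from the composite correlations alone.
recovered = {x: r_composite[x] * total for x in tests}
print(row_sum, recovered)
```

Subtracting 1.00 from each recovered sum gives the sum of the intercorrelations of that test with the other tests only.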